Synthetic data generator for testing record linkage routines in Brazil.
Similar resources
Probabilistic Linkage of Persian Record with Missing Data
Extended Abstract. When comprehensive information about a topic is scattered across two or more data sets, using only one of them loses the information available in the others. Hence, it is necessary to integrate the scattered information into a single comprehensive data set. On the other hand, sometimes we are interested in recognizing duplicates within a data set. The i...
Improved record linkage for encrypted identifying data
The health data integration project at the E-Health Research Centre is researching ways of improving the integration of health and health-related data while maintaining the privacy and security of the data. One such method is to improve the mechanisms for matching patients across databases when the identifying information must not be revealed, even during the linkage step. Background: With healt...
Data Fusion with Record Linkage
Assume that there are two sources (e.g. files) consisting of records with different information about some units, such as people. We want to fuse the information (data) that belongs to the same units. Very often in practice no identification numbers, like the Social Security Number (SSN), are available in both files, which is why there is some uncertainty about which records belong together. Anyway, we...
An Ensemble Approach for Record Matching in Data Linkage
OBJECTIVES To develop and test an optimal ensemble configuration of two complementary probabilistic data matching techniques, namely Fellegi-Sunter (FS) and Jaro-Winkler (JW), with the goal of improving record matching accuracy. METHODS Experiments and comparative analyses were carried out to compare matching performance amongst the ensemble configurations combining FS and JW against the two t...
Theory for Record Linkage
person or event or whether there is insufficient evidence to justify either of these decisions at stipulated levels of error. These three decisions are referred to as link (A1), non-link (A3), and possible link (A2). The first two decisions are called positive dispositions. The two types of error are defined as the error of the decision when the members of the comparison pair are in fact unmatched and the...
Journal
Journal title: International Journal of Population Data Science
Year: 2018
ISSN: 2399-4908
DOI: 10.23889/ijpds.v3i4.722